Abstract

We show that any $N$-dimensional linear subspace of $L^2(\mathbb{T})$ admits an orthonormal system such that the $L^2$ norm of the square variation operator $\mathcal{V}^2$ is as small as possible. When applied to the span of the trigonometric system, we obtain an orthonormal system of trigonometric polynomials with a $\mathcal{V}^2$ operator that is considerably smaller than the associated operator for the trigonometric system itself.

1 Introduction 

Let $\mathbb{T}$ denote a probability space and $\Phi := \{\phi_n(x)\}_{n=1}^{N}$ an orthonormal system (ONS) of functions from $\mathbb{T}$ to $\mathbb{R}$. One is often interested, usually motivated by questions regarding almost everywhere convergence, in the behavior of the maximal function

$$\mathcal{M}f := \max_{\ell \le N} \Big| \sum_{n=1}^{\ell} a_n \phi_n \Big|, \qquad f = \sum_{n=1}^{N} a_n \phi_n.$$

For an arbitrary ONS, the Rademacher-Menshov theorem states that $\|\mathcal{M}f\|_{L^2} \ll \log(N)\|f\|_{L^2}$, where the $\log(N)$ factor is known to be sharp. One can, however, do much better for many classical systems; for instance, one can replace $\log(N)$ with an absolute constant in the case of the trigonometric system (the Carleson-Hunt inequality). More recently, there has been interest in variational refinements of these maximal results. Define the $r$-th variation operator

$$\mathcal{V}^r f := \max_{\pi \in \mathcal{P}_N} \Big( \sum_{I \in \pi} \Big| \sum_{n \in I} a_n \phi_n \Big|^r \Big)^{1/r},$$

where $\mathcal{P}_N$ denotes the set of partitions of $[N]$ into subintervals. Clearly, $|\mathcal{M}f| \le |\mathcal{V}^r f|$ for all $r < \infty$. In the case of the trigonometric system it has been shown that $\|\mathcal{V}^r f\|_2 \ll \|f\|_2$ for $r > 2$ (see [12]), and $\|\mathcal{V}^2 f\|_2 \ll \sqrt{\log(N)}\,\|f\|_2$ (see [8]), where the factor of $\sqrt{\log(N)}$ is optimal. This latter inequality has some applications to sieve theory [9]. The factor of $\sqrt{\log(N)}$ is rather unfortunate, leading to inefficiencies in these applications. It is likely that this factor can be improved for the functions arising in the applications, for instance, if the Fourier support of $f$ is contained in certain arithmetic sets. This is a potential route towards improving the estimates in [9]. Some results in this direction can be found in section 7 of [8].
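
To make the definitions of $\mathcal{M}f$ and $\mathcal{V}^r f$ concrete, the following sketch evaluates both at a single point $x$, given the values $a_n\phi_n(x)$ as a plain list. It is an illustration only: the names `maximal` and `variation` are hypothetical, and the maximum over partitions is computed with a simple quadratic-time dynamic program, which is an assumption of this sketch and not a method taken from the paper.

```python
def maximal(terms):
    """Max over l of |sum_{n <= l} terms[n]| (the maximal function at one point)."""
    prefix, best = 0.0, 0.0
    for x in terms:
        prefix += x
        best = max(best, abs(prefix))
    return best


def variation(terms, r=2.0):
    """Max over partitions of [N] into intervals of (sum_I |sum_{n in I} terms[n]|^r)^(1/r)."""
    n = len(terms)
    prefix = [0.0]
    for x in terms:
        prefix.append(prefix[-1] + x)
    # best[j] = largest value of sum_I |block sum|^r over partitions of the first j terms
    best = [0.0] * (n + 1)
    for j in range(1, n + 1):
        best[j] = max(best[i] + abs(prefix[j] - prefix[i]) ** r for i in range(j))
    return best[n] ** (1.0 / r)
```

For any input, `maximal(terms)` is at most `variation(terms, r)`, mirroring the pointwise inequality $|\mathcal{M}f| \le |\mathcal{V}^r f|$: the partition $\{[1,\ell],[\ell+1,N]\}$ already captures the $\ell$-th partial sum.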

In a different direction, it seems that the $\sqrt{\log(N)}$ factor might also be an eccentricity of the standard ordering of the trigonometric system. In [8] the following problem was posed:

Problem 1. Is there a permutation $\sigma : [N] \to [N]$ such that the reordering of the trigonometric system $\Phi := \{\phi_n = e(\sigma(n)x)\}$ (where $e(x) := e^{2\pi i x}$) satisfies

$$\|\mathcal{V}^2 f\|_2 \ll o\big(\sqrt{\log(N)}\big)\,\|f\|_2$$

for all $f$ in the span of the system?

This problem can be thought of as a variational variant of Garsia's conjecture. We refer the reader to [2] and [8] for discussion of these and related problems. In support of an affirmative answer, it was proved in [8] that given a function $f = \sum_{n=1}^{N} a_n e(nx)$, there exists a permutation $\sigma : [N] \to [N]$ such that the reordered trigonometric system satisfies $\|\mathcal{V}^2 f\|_2 \ll \sqrt{\log\log(N)}\,\|f\|_2$. There the permutation is allowed to depend on the function, while the above problem seeks a permutation that works for all functions simultaneously.

In this paper, we will study the following related problem. Given an ONS $\Phi := \{\phi_n(x)\}_{n=1}^{N}$ and an $N \times N$ orthogonal matrix $O = \{o_{i,n}\}_{1 \le i,n \le N}$, we define a new ONS, $\Psi := \{\psi_n(x)\}_{n=1}^{N}$, by

$$\psi_n(x) := \sum_{i=1}^{N} o_{i,n}\,\phi_i(x).$$

This new system will span the same space as the original system. Conversely, every such ONS can be obtained from some element of the orthogonal group, $O(N)$. Let us write $\Phi(O) := \Psi$. Furthermore, in what follows $Q$ will denote a measurable subset of $O(N)$ and $\mathbb{P}[Q]$ will denote the Haar measure of $Q$.
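
As a small numerical illustration of the base change $\Phi \mapsto \Phi(O)$, the sketch below draws an orthogonal matrix by applying Gram-Schmidt to independent Gaussian columns (a standard way to sample from Haar measure on $O(N)$) and checks that the transformed system is still orthonormal. Everything here is an assumption made for illustration: the probability space is a finite set of sample points with uniform measure, and the helper names `haar_orthogonal`, `base_change` and `gram` are hypothetical.

```python
import random


def haar_orthogonal(n, rng):
    """Gram-Schmidt on i.i.d. Gaussian vectors; returns the columns of an orthogonal matrix."""
    cols = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for u in cols:  # remove components along the columns found so far
            dot = sum(a * b for a, b in zip(u, v))
            v = [b - dot * a for a, b in zip(u, v)]
        norm = sum(b * b for b in v) ** 0.5
        cols.append([b / norm for b in v])
    return cols  # cols[n][i] plays the role of o_{i,n}


def base_change(phi, cols):
    """phi[i] lists the values of phi_i at the sample points; returns psi_n = sum_i o_{i,n} phi_i."""
    points = len(phi[0])
    return [[sum(col[i] * phi[i][x] for i in range(len(phi))) for x in range(points)]
            for col in cols]


def gram(system):
    """Matrix of inner products with respect to the uniform probability measure."""
    points = len(system[0])
    return [[sum(f[x] * g[x] for x in range(points)) / points for g in system]
            for f in system]
```

Applied to any system whose Gram matrix is the identity, `gram(base_change(phi, cols))` is again the identity up to rounding, which is the statement that $\Phi(O)$ is an ONS spanning the same space.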

Theorem 2. Given an ONS $\Phi := \{\phi_n(x)\}_{n=1}^{N}$ from $\mathbb{T}$ to $\mathbb{R}$, there exists an alternate ONS $\Phi(O)$ that spans the same space, and satisfies

$$\|\mathcal{V}^2 f\|_2 \ll \sqrt{\log\log(N)}\,\|f\|_2 \tag{1}$$

for all $f$ in the span. In fact, the conclusion holds for all $O \in Q$ for some $Q \subseteq O(N)$ with $\mathbb{P}[Q] \ge 1 - Ce^{-cN^{2/5}}$ (for some absolute positive constants $C, c$).

If we take $\Phi := \{e(nx)\}_{n=1}^{N}$, then this produces an ONS of trigonometric polynomials (spanning the same space as the trigonometric system) with much smaller square variation than the trigonometric system. Strictly speaking, Theorem 2 is stated for real valued ONS, but the result for the trigonometric system can be obtained by splitting into real and imaginary parts and noting the corresponding result holds on each with large probability. We note that Problem 1 asks for a similar conclusion where $O$ is restricted to be a permutation matrix instead of just an orthogonal matrix.

Theorem 2 is sharp. Consider an ONS of independent, mean zero, variance one Gaussians, $\{g_i\}_{i=1}^{N}$. Notice that applying an orthogonal transformation to this system leaves it metrically unchanged. On the other hand, we have that $\max_{\pi \in \mathcal{P}_N} \sum_{I \in \pi} \big|\sum_{n \in I} g_n\big|^2 \sim 2N\log\log(N)$ (almost surely) from the variational law of the iterated logarithm [10].

Let us briefly outline the key idea in the proof of Theorem 2. In [8], we proved an estimate of the form (1) for systems of bounded independent random variables (see Theorem 9). The key ingredient in that case is that for every $f$ in the span of the system we have the sub-Gaussian tail estimate $\|f\|_{\mathcal{G}} \ll \|f\|_2$ (where $\|\cdot\|_{\mathcal{G}}$ is the Orlicz space norm associated to $e^{x^2} - 1$). This clearly cannot hold in the setting of Theorem 2 since any $L^2$ function can be in the span of the system. However, we will show that a function $f$ in the span of a generic basis $\Phi(O)$ can be split $f = G + E$, where $G$ satisfies a sub-Gaussian tail inequality and $E$ has small $L^2$ norm (decreasing with the size of the Fourier support of $f$). More precisely, we will prove (note that we abuse the notation $c$ below to denote multiple distinct constants):

Proposition 3. For $N$ fixed, let $\Phi = \{\phi_n(x)\}_{n=1}^{N}$ be an ONS such that $\sum_{n=1}^{N} |\phi_n(x)|^2 \le N$ holds (pointwise). There exists $Q \subseteq O(N)$ with $\mathbb{P}[Q] \ge 1 - Ce^{-cN^{2/5}}$ such that for $O \in Q$, we have that the associated ONS $\Phi(O) = \{\psi_n\}_{n=1}^{N}$ satisfies the following property. For any $f = \sum a_n \psi_n$, letting $m$ denote $\mathrm{support}(\{a_n\})$ (the number of nonzero $a_n$ values), we have that the function defined by

$$f := \sum a_n \psi_n(x)$$

can be decomposed as $f := G + E$ where $\|G\|_{\mathcal{G}} \ll \|f\|_2$ and $\|E\|_2 \ll \big(\frac{m}{N}\big)^{c}\|f\|_2$ for some universal constant $c > 0$.

See Proposition 15 below, which gives a stronger maximal form of this statement. The condition $\sum_{n=1}^{N} |\phi_n(x)|^2 \le N$ can usually be removed in applications (such as Theorem 2) by a change of measure argument (see Lemma 6). It seems likely that this decomposition may have other applications.



2 Preliminaries 

We need to define several different norms on the space of functions from $\mathbb{T}$ to $\mathbb{R}$. First, for a positive constant $c$, let $\|\cdot\|_{\mathcal{G}(c)}$ denote the norm of the Orlicz space associated to the convex function $e^{cx^2} - 1$. That is,

$$\|f\|_{\mathcal{G}(c)} := \inf\Big\{\lambda \in \mathbb{R}^{+} : \int_{\mathbb{T}} e^{c|f/\lambda|^2} - 1 \le 1\Big\}.$$

When we write $\|\cdot\|_{\mathcal{G}}$ with the specification of $c$ omitted, we mean $c = 1$.
We next define the convex function

$$\Gamma_K(t) := \begin{cases} e^{t^2} - 1, & |t| \le K, \\ e^{K^2}t^2 + e^{K^2}(1 - K^2) - 1, & |t| > K, \end{cases}$$

and denote the associated Orlicz norm $\|\cdot\|_{\Gamma_K}$. We then have

Lemma 4. When $K \ge 1$, for all $t$ we have that

$$\Gamma_K(t) \le e^{t^2} - 1,$$

$$\Gamma_K(t) \le e^{K^2}t^2.$$

It follows that for $f : \mathbb{T} \to \mathbb{R}$ we have $\|f\|_{\Gamma_K} \le \|f\|_{\mathcal{G}}$ and $\|f\|_{\Gamma_K} \le e^{K^2/2}\|f\|_{L^2}$.

Proof. We first prove $\Gamma_K(t) \le e^{t^2} - 1$ for all $t$. For $t$ such that $|t| \le K$, this is clear since $\Gamma_K(t) = e^{t^2} - 1$. We consider $t$ such that $|t| > K$. Then $\Gamma_K(t) = e^{K^2}t^2 + e^{K^2}(1 - K^2) - 1$, so we must show that $e^{K^2}t^2 + e^{K^2}(1 - K^2) \le e^{t^2}$. We note that for all real $x \ge 0$, $1 + x \le e^{x}$. Applying this with $x = t^2 - K^2 \ge 0$, we have:

$$e^{K^2}t^2 + e^{K^2}(1 - K^2) = e^{K^2}(t^2 - K^2 + 1) \le e^{K^2}e^{t^2 - K^2} = e^{t^2},$$

as required.

We let $f$ be a function from $\mathbb{T}$ to $\mathbb{R}$. For any fixed positive real number $\lambda$ such that $\int e^{|f/\lambda|^2} - 1 \le 1$ (i.e. $\lambda \ge \|f\|_{\mathcal{G}}$), we have

$$\int \Gamma_K(f/\lambda) \le \int e^{|f/\lambda|^2} - 1 \le 1,$$

since $\Gamma_K(t) \le e^{t^2} - 1$ for all $t$. This shows that $\lambda \ge \|f\|_{\Gamma_K}$, hence $\|f\|_{\Gamma_K} \le \|f\|_{\mathcal{G}}$.

We next prove $\Gamma_K(t) \le e^{K^2}t^2$. We first consider $t$ such that $|t| > K$. In this case, $\Gamma_K(t) = e^{K^2}t^2 + e^{K^2}(1 - K^2) - 1$. Since $K \ge 1$, we see that $e^{K^2}(1 - K^2) \le 0$, so $\Gamma_K(t) \le e^{K^2}t^2$ follows. For $t$ such that $|t| \le K$, we have $\Gamma_K(t) = e^{t^2} - 1$, so we must show that $e^{t^2} - 1 \le e^{K^2}t^2$ for $|t| \le K$.

We consider $\frac{e^{t^2} - 1}{t^2}$ as a function of $t$ for $t > 0$. Its derivative is:

$$2\big(t^{-1}e^{t^2} - t^{-3}e^{t^2} + t^{-3}\big).$$

We observe that this is always non-negative. To see this, consider multiplying the quantity by $t^3$ to obtain $2(t^2e^{t^2} - e^{t^2} + 1)$. Non-negativity then follows from the inequality $1 + xe^{x} \ge e^{x}$ for all real $x \ge 0$. (This inequality can be proved by noting that $xe^{x} \ge \int_0^x e^{u}\,du$.) Hence $\frac{e^{t^2} - 1}{t^2}$ is a non-decreasing function of $t$ in the range $0 < t \le K$, so it suffices to consider the value at $t = K$, which is $K^{-2}(e^{K^2} - 1)$. Since $K \ge 1$, this is $\le e^{K^2}$, as required.

For $f : \mathbb{T} \to \mathbb{R}$, we consider $\lambda := e^{K^2/2}\|f\|_{L^2}$. Then

$$\int \Gamma_K(f/\lambda) \le \int e^{K^2}\frac{|f|^2}{\lambda^2} = \frac{e^{K^2}}{\lambda^2}\|f\|_{L^2}^2 = 1,$$

since $\Gamma_K(t) \le e^{K^2}t^2$. Thus, $\|f\|_{\Gamma_K} \le e^{K^2/2}\|f\|_{L^2}$. □
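
The two pointwise inequalities of Lemma 4 are easy to sanity-check numerically. The sketch below is only an illustration under the definition of $\Gamma_K$ given above; the function name `gamma_K` is hypothetical.

```python
import math


def gamma_K(t, K):
    """The convex function Gamma_K: Gaussian-type growth up to |t| = K, quadratic beyond."""
    t = abs(t)
    if t <= K:
        return math.exp(t * t) - 1.0
    # quadratic continuation; the two branches agree at |t| = K
    return math.exp(K * K) * t * t + math.exp(K * K) * (1.0 - K * K) - 1.0
```

Evaluating `gamma_K(t, K)` on a grid of $t$ for a few values of $K \ge 1$ and comparing with `math.exp(t*t) - 1` and `math.exp(K*K) * t*t` reproduces both bounds, with equality in the first one on $|t| \le K$.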
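Lemma 5. For any (measurable) $f : \mathbb{T} \to \mathbb{R}$, we can decompose $f = f_1 + f_2$ such that

$$\|f_1\|_{\mathcal{G}} \ll \|f\|_{\Gamma_K},$$

$$\|f_2\|_{L^2} \ll e^{-cK^2}\|f\|_{\Gamma_K},$$

for some universal constant $c > 0$.

Proof. Given $f$, we define $\gamma := 2\|f\|_{\Gamma_K}$ to simplify our notation. We then set:

$$f_1 := f \cdot 1_{|f/\gamma| \le K} \quad\text{and}\quad f_2 := f \cdot 1_{|f/\gamma| > K},$$

where for a set $S \subseteq \mathbb{T}$, $1_S$ denotes the indicator function for that set. By definition of $\gamma = 2\|f\|_{\Gamma_K} > \|f\|_{\Gamma_K}$, we have that

$$\int \Gamma_K(f/\gamma) = \int \big(e^{|f/\gamma|^2} - 1\big)\cdot 1_{|f/\gamma| \le K} + \int \big(e^{K^2}f^2/\gamma^2 + e^{K^2}(1 - K^2) - 1\big)\cdot 1_{|f/\gamma| > K} \le 1. \tag{2}$$

Since this is a sum of two non-negative quantities, this implies

$$\int \big(e^{|f/\gamma|^2} - 1\big)\cdot 1_{|f/\gamma| \le K} \le 1.$$

This is equivalent to:

$$\int e^{|f_1/\gamma|^2} - 1 \le 1,$$

and so $\|f_1\|_{\mathcal{G}} \le \gamma \ll \|f\|_{\Gamma_K}$.

Again considering (2), we also have

$$\int \big(e^{K^2}f^2/\gamma^2 + e^{K^2}(1 - K^2) - 1\big)\cdot 1_{|f/\gamma| > K} \le 1.$$

We let $\mu(|f/\gamma| > K)$ denote the measure of the set on which $|f/\gamma| > K$, and rewrite the above as:

$$\mu(|f/\gamma| > K)\big(e^{K^2}(1 - K^2) - 1\big) + \int e^{K^2}f_2^2/\gamma^2 \le 1. \tag{3}$$

Now, since $\int \Gamma_K(f/\gamma) \le 1$ and $\Gamma_K(f/\gamma) \ge e^{K^2} - 1$ whenever $|f/\gamma| > K$, we must have

$$\mu(|f/\gamma| > K)\big(e^{K^2} - 1\big) \le 1.$$

Thus, $\mu(|f/\gamma| > K) \le \frac{1}{e^{K^2} - 1}$. Combining this with (3), we have

$$\int e^{K^2}f_2^2/\gamma^2 \le 1 + \mu(|f/\gamma| > K)\big(e^{K^2}(K^2 - 1) + 1\big) \ll K^2,$$

and hence

$$\|f_2\|_{L^2}^2 \ll K^2e^{-K^2}\gamma^2,$$

implying that $\|f_2\|_{L^2} \ll e^{-cK^2}\|f\|_{\Gamma_K}$ for some universal constant $c > 0$. □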

Finally, we note the following. 

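Lemma 6. It suffices to prove Theorem 2 with the restriction that $\sum_{n=1}^{N} |\phi_n(x)|^2 \le N$.

Proof. Consider an arbitrary ONS $\Phi := \{\phi_n\}_{n=1}^{N}$ and define $\nu(x) = N^{-1}\sum_{n=1}^{N} |\phi_n(x)|^2$. Fix $O \in O(N)$. Define $\tilde{\Phi} := \Phi(O) = \{\tilde{\phi}_n\}$. Furthermore, consider the ONS $\Psi$ defined on $\mathbb{T}$ (with the measure induced by integration against $\nu(x)$) by $\psi_n(x) := \nu^{-1/2}(x)\phi_n(x)$. Furthermore, define $\tilde{\Psi} = \Psi(O) = \{\tilde{\psi}_n\}$. We have the trivial identity

$$\int \max_{\pi \in \mathcal{P}_N} \sum_{I \in \pi} \Big|\sum_{n \in I} a_n\tilde{\phi}_n\Big|^2 = \int \max_{\pi \in \mathcal{P}_N} \sum_{I \in \pi} \Big|\sum_{n \in I} a_n\tilde{\psi}_n\Big|^2 \nu(x).$$

Thus, the conclusion of Theorem 2 holds for $\Phi$ if and only if it holds for $\Psi$. However, $\sum_{n=1}^{N} |\psi_n(x)|^2 \le N$ by construction. □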



3 Probabilistic Methods 

In this section we establish the following result: 

Proposition 7. For $N$ fixed, let $\{\phi_n(x)\}_{n=1}^{N}$ be an ONS such that $\sum_{n=1}^{N} |\phi_n(x)|^2 \le N$. Define for each $1 \le m \le N$ the function $\Gamma_* := \Gamma_K$ with $K = \sqrt{\frac{2}{5}\log\big(\frac{N}{m}\log\big(\frac{N}{m} + 1\big)\big)}$ (the dependence on $m$ is implicit in this notation). There exists a subset $Q \subseteq O(N)$ with $\mathbb{P}[Q] \ge 1 - Ce^{-cN^{2/5}}$ such that for all $O = \{o_{i,n}\}_{1 \le i,n \le N} \in Q$ the corresponding base change of $\{\phi_n\}_{n=1}^{N}$, that is

$$\psi_n := \sum_{i=1}^{N} o_{i,n}\,\phi_i,$$

satisfies the following. For each $m$ in the range $1 \le m \le N$,

$$\Big\|\sum_{n=1}^{N} a_n\psi_n\Big\|_{\Gamma_*} \ll \|a\|_{\ell^2}$$

for all vectors $a \in \mathbb{R}^{N}$ such that $\mathrm{support}(a) \le m$. (We use $\mathrm{support}(a)$ to denote the number of nonzero coordinates of $a$.)

The proof will build on arguments from [2], although the estimates we obtain are substantially stronger. We start by establishing a weaker result. For a fixed $m$ in the range $1 \le m \le N$, we let $\mathbb{S}_m \subseteq \mathbb{R}^{N}$ denote the subset of vectors $b$ such that $\|b\|_2 \le 1$ and $\mathrm{support}(b) \le m$. We then define

$$B(m, O) := \sup_{a \in \mathbb{S}_m} \Big\|\sum_{n=1}^{N} a_n\psi_n\Big\|_{\Gamma_*}.$$

Note that both the set $\mathbb{S}_m$ and the function $\Gamma_*$ depend on $m$. Our first step will be to establish the following:

Proposition 8. For any $1 \le m \le N$ we have that

$$\mathbb{E}_{O(N)}\,B(m, O) \ll 1,$$

where the implied constant is independent of $m$ and $N$.

This does not quite give Proposition 7 since there the claim is made with large probability and we require the estimates to hold for all $m$ simultaneously. The stronger claim, however, will be deduced later from the weaker statement using the concentration of measure phenomenon on the orthogonal group.

We will need the following result. This is Lemma 5.5 from [2]. There it is attributed to PP. The result is a concatenation of Lemma 1.10 and 1.12 in [1]. These are due to [3] and [6], respectively.

Lemma 9. Let $X$ and $Y$ be Banach spaces and consider the operator

$$T_O := \sum_{i,j=1}^{N} o_{i,j}\,(x_i^* \otimes y_j)$$

for $O := (o_{i,j})_{1 \le i,j \le N} \in O(N)$, and where $\{x_i^*\}_{i=1}^{N}$ (respectively $\{y_j\}_{j=1}^{N}$) are sequences in $X^*$ (respectively $Y$). Then,

$$\int_{O(N)} \|T_O\|\,dO \le \frac{C\alpha(\{x_i^*\}_{i=1}^{N})}{\sqrt{N}} \int \Big\|\sum_{j=1}^{N} g_j(\omega)y_j\Big\|\,d\omega + \frac{C\alpha(\{y_j\}_{j=1}^{N})}{\sqrt{N}} \int \Big\|\sum_{i=1}^{N} g_i(\omega)x_i^*\Big\|\,d\omega, \tag{4}$$

where

$$\alpha(\{x_i^*\}) := \sup\Big\{\Big(\sum |\langle x_i^*, x\rangle|^2\Big)^{1/2} : x \in X,\ \|x\| \le 1\Big\},$$

$$\alpha(\{y_j\}) := \sup\Big\{\Big(\sum |\langle y_j, y^*\rangle|^2\Big)^{1/2} : y^* \in Y^*,\ \|y^*\| \le 1\Big\},$$

and $\{g_i\}_{i=1}^{N}$ is a system of independent Gaussians with mean zero and variance one. Note that the norms in (4) refer respectively to the Banach spaces $B(X, Y)$, $Y$, and $X^*$.

Let $\ell^2[N]$ denote the set of real sequences $a := \{a_n\}_{n=1}^{N}$. We will denote by $X$ the Banach space obtained by considering this set with the norm $\|\cdot\|_{[m]}$ defined as follows. For a vector $a$, we define $\|a\|_{[m]}$ to be the infimum of positive $c \in \mathbb{R}$ such that scaling the convex hull of $\mathbb{S}_m$ by $c$ results in a set containing $a$. We take $Y$ to be the space of real-valued functions on $\mathbb{T}$ equipped with the Orlicz norm associated to $\Gamma_*$.

Let $x_i^*$ ($1 \le i \le N$) denote the canonical unit vectors in $\mathbb{R}^{N}$ (which is naturally identified with the dual space $X^*$). We have, from Lemma 9, that

$$\mathbb{E}\,B(m, O) \ll \frac{\alpha(\{x_i^*\}_{i=1}^{N})}{\sqrt{N}}\,\mathbb{E}\Big\|\sum g_i\phi_i\Big\|_{\Gamma_*} + \frac{\alpha(\{\phi_i\}_{i=1}^{N})}{\sqrt{N}}\,\mathbb{E}\Big\|\sum g_ix_i^*\Big\|_{X^*}.$$

In order to establish Proposition 8 we need to show the above is $\ll 1$. This follows from the following estimates:

$$\alpha(\{x_i^*\}_{i=1}^{N}) \ll 1,$$

$$\alpha(\{\phi_i\}_{i=1}^{N}) \ll \Big(\frac{N}{m}\log\Big(\frac{N}{m} + 1\Big)\Big)^{1/5},$$

$$\mathbb{E}\Big\|\sum g_i\phi_i\Big\|_{\Gamma_*} \ll \sqrt{N},$$

$$\mathbb{E}\Big\|\sum g_ix_i^*\Big\|_{X^*} \ll \sqrt{m\log\Big(\frac{N}{m} + 1\Big)}.$$

The first estimate above follows from the observation that the convex hull of $\mathbb{S}_m$ is contained in the $\ell^2$ unit ball in $\mathbb{R}^{N}$. We will prove the others in the following lemmas.

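Lemma 10. We have that $\mathbb{E}\big\|\sum g_i\phi_i\big\|_{\Gamma_*} \ll \sqrt{N}$.

Proof. Letting $C$ be a positive constant, by Fubini's theorem we have that $\mathbb{E}\int e^{(\sum g_i\phi_i(x))^2/(CN)}\,dx = \int \mathbb{E}\,e^{(\sum g_i\phi_i(x))^2/(CN)}\,dx$. Now, for each fixed $x$, we recall that $\sum_i |\phi_i(x)|^2 \le N$, so $\sum g_i\phi_i(x)$ is a Gaussian random variable with mean $0$ and variance at most $N$. Thus, $\int \mathbb{E}\,e^{(\sum g_i\phi_i(x))^2/(CN)}\,dx \ll 1$ for an appropriate choice of $C$.

Since $e^{f^2/\lambda^2} \le 1 + \frac{e^{f^2}}{\lambda^2}$ for $\lambda \ge 1$, we have that $\inf\{\lambda \in \mathbb{R}^{+} : \int e^{|f/\lambda|^2} \le 2\} \le 1 + \int e^{|f|^2}$. Applying this to $f = \frac{1}{\sqrt{CN}}\sum g_i\phi_i$, we have

$$\Big\|\frac{1}{\sqrt{CN}}\sum g_i\phi_i\Big\|_{\Gamma_*} \le 1 + \int e^{(\sum g_i\phi_i(x))^2/(CN)}\,dx.$$

Taking expectations on both sides, we have $\mathbb{E}\big\|\sum g_i\phi_i\big\|_{\Gamma_*} \ll \sqrt{N}$, as required. □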

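Lemma 11. We have that $\alpha(\{\phi_i\}_{i=1}^{N}) \ll \big(\frac{N}{m}\log\big(\frac{N}{m} + 1\big)\big)^{1/5}$.

Proof. From Lemma 4 it follows that $\|f\|_{\Gamma_*} \le \big(\frac{N}{m}\log\big(\frac{N}{m} + 1\big)\big)^{1/5}\|f\|_{L^2}$. Now

$$\|g\|_{\Gamma_*^*} = \sup_{f \in \Gamma_*} \frac{|\langle f, g\rangle|}{\|f\|_{\Gamma_*}} \ge \frac{\|g\|_2^2}{\|g\|_{\Gamma_*}} \ge \frac{\|g\|_2^2}{\big(\frac{N}{m}\log\big(\frac{N}{m} + 1\big)\big)^{1/5}\|g\|_2} = \Big(\frac{N}{m}\log\Big(\frac{N}{m} + 1\Big)\Big)^{-1/5}\|g\|_2.$$

Here we have used that each element of the dual space $\Gamma_*^*$ can be represented by integration against a measurable function. This follows from standard properties of Orlicz spaces. In particular, see Theorem 14.2 of [5], since the modulus satisfies the $\Delta_2$ condition.

It now follows that if $\|g\|_{\Gamma_*^*} \le 1$ then $\|g\|_2 \ll \big(\frac{N}{m}\log\big(\frac{N}{m} + 1\big)\big)^{1/5}$. Thus by Bessel's inequality we have

$$\alpha(\{\phi_i\}) := \sup\Big\{\Big(\sum |\langle \phi_i, g\rangle|^2\Big)^{1/2} : g \in Y^*,\ \|g\|_{\Gamma_*^*} \le 1\Big\} \ll \Big(\frac{N}{m}\log\Big(\frac{N}{m} + 1\Big)\Big)^{1/5},$$

which completes the proof. □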



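Lemma 12. We have that $\mathbb{E}\big\|\sum g_ix_i^*\big\|_{X^*} \ll \sqrt{m\log\big(\frac{N}{m} + 1\big)}$.

Proof. It follows from the definition of $X^*$ that

$$\mathbb{E}\Big\|\sum g_ix_i^*\Big\|_{X^*} = \mathbb{E}\sup_{a \in \mathbb{S}_m} \Big|\sum g_ia_i\Big|.$$

(Note that taking the supremum over the convex hull of $\mathbb{S}_m$ would yield the same result.)

The latter quantity is well studied in the theory of Gaussian processes. Recall that Dudley's bound [1] gives

$$\mathbb{E}\sup_{a \in \mathbb{S}_m} \Big|\sum g_ia_i\Big| \ll \int_0^{\infty} \sqrt{\log(\mathcal{N}(\mathbb{S}_m, \epsilon))}\,d\epsilon,$$

where $\mathcal{N}(\mathbb{S}_m, \epsilon)$ denotes the number of $\ell^2$ balls of radius $\epsilon$ needed to cover $\mathbb{S}_m$. Now clearly $\mathbb{S}_m$ is a subset of the $N$-dimensional $\ell^2$ unit ball, thus $\log(\mathcal{N}(\mathbb{S}_m, \epsilon)) = 0$ for $\epsilon \ge 1$, and the above quantity is equal to

$$\int_0^{1} \sqrt{\log(\mathcal{N}(\mathbb{S}_m, \epsilon))}\,d\epsilon.$$

Lemma 12 now follows from the following: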
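Lemma 13. For $0 < \epsilon \le 1$, we have that

$$\mathcal{N}(\mathbb{S}_m, \epsilon) \le \binom{N}{m}\Big(\frac{3}{\epsilon}\Big)^{m}$$

and thus

$$\log\mathcal{N}(\mathbb{S}_m, \epsilon) \ll m\log\Big(\frac{N}{m} + 1\Big) + m\log\Big(\frac{3}{\epsilon}\Big).$$

Proof. We only prove the first inequality (the second follows by taking logarithms). We let $K$ denote the unit $\ell^2$ ball in $\mathbb{R}^{m}$. Then $\mathcal{N}(K, \epsilon K) \le \big(\frac{3}{\epsilon}\big)^{m}$, where $\mathcal{N}(K, \epsilon K)$ denotes the number of translates of $\epsilon K$ needed to cover $K$. To see this, consider a maximal set of disjoint balls of radius $\frac{\epsilon}{2}$ with centers in $K$. Let $T$ denote the set of their centers. By maximality, taking balls of radius $\epsilon$ around each point in $T$ yields a cover of $K$, and hence the cardinality of $T$ is an upper bound on $\mathcal{N}(K, \epsilon K)$. Now, the union of all the disjoint balls of radius $\frac{\epsilon}{2}$ with centers in $T$ is a set with volume equal to $|T|\,\mathrm{vol}(\frac{\epsilon}{2}K)$, where $|T|$ denotes the cardinality of $T$ and $\mathrm{vol}(\frac{\epsilon}{2}K)$ denotes the volume of the ball of radius $\frac{\epsilon}{2}$. Since this set is contained in $(1 + \frac{\epsilon}{2})K$, we have

$$\mathcal{N}(K, \epsilon K) \le \frac{\mathrm{vol}((1 + \frac{\epsilon}{2})K)}{\mathrm{vol}(\frac{\epsilon}{2}K)} = \frac{(1 + \frac{\epsilon}{2})^{m}}{(\frac{\epsilon}{2})^{m}} = \Big(1 + \frac{2}{\epsilon}\Big)^{m} \le \Big(\frac{3}{\epsilon}\Big)^{m}$$

whenever $0 < \epsilon \le 1$.

Fix $m$ coordinates and consider the associated $m$-dimensional $\ell^2$ ball. We have shown that this can be covered by $\big(\frac{3}{\epsilon}\big)^{m}$ balls of radius $\epsilon$. Summing over all $\binom{N}{m}$ such balls completes the proof. □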
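
The way Lemma 13 feeds into Dudley's integral can be checked numerically. The following sketch is an illustration only: it plugs the bound $\binom{N}{m}(3/\epsilon)^m$ into the entropy integral over $(0, 1]$ using a midpoint rule. The function names and the choice of quadrature are assumptions of this sketch, not part of the paper.

```python
import math


def log_cover_bound(N, m, eps):
    """Logarithm of the covering bound binom(N, m) * (3 / eps)^m from Lemma 13."""
    return math.log(math.comb(N, m)) + m * math.log(3.0 / eps)


def dudley_integral(N, m, steps=2000):
    """Midpoint-rule approximation of the integral over (0, 1] of sqrt(log_cover_bound)."""
    h = 1.0 / steps
    return h * sum(math.sqrt(log_cover_bound(N, m, (j + 0.5) * h)) for j in range(steps))
```

Dividing `dudley_integral(N, m)` by `math.sqrt(m * math.log(N / m + 1))` gives a ratio that stays bounded as $N$ and $m$ vary, in line with the estimate claimed in Lemma 12.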

This completes the proof of Lemma 12 and hence the proof of Proposition 8. □

3.1 Concentration of Measure on O(N)

In the prior section, we proved that for any $1 \le m \le N$ we have $\mathbb{E}_{O(N)}\,B(m, O) \ll 1$. It follows from Markov's inequality that for some large universal $C$, we have $\mu(A(m)) \ge \frac{1}{2}$, where

$$A(m) := \{O \in O(N) : B(m, O) \le C\}$$

and $\mu(A(m))$ denotes the measure of the set $A(m)$ in $O(N)$.

Consider the Hilbert-Schmidt norm on the set of $N \times N$ matrices, $\|A\|_{HS} := \big(\sum_{1 \le i,j \le N} |A_{i,j}|^2\big)^{1/2}$. We recall the concentration of measure inequality on the orthogonal group (see [11]):

Lemma 14. Let $\mu$ denote the Haar measure on the orthogonal group $O(N)$ and $A \subseteq O(N)$ such that $\mu(A) \ge \frac{1}{2}$. Then,

$$\mu\Big[U \in O(N) : \inf_{B \in A} \|U - B\|_{HS} \ge \epsilon\Big] \ll e^{-c\epsilon^2N}$$

for some absolute positive constant $c$.

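For any $N \times N$ matrix $M = \{m_{i,n}\}$, using the bounds from Lemma 4 we have

$$\Big\|\sum_{1 \le i,n \le N} a_nm_{i,n}\phi_i\Big\|_{\Gamma_*} \le \Big(\frac{N}{m}\log\Big(\frac{N}{m} + 1\Big)\Big)^{1/5}\Big(\sum_{i}\Big|\sum_{n} a_nm_{i,n}\Big|^2\Big)^{1/2} \le \Big(\frac{N}{m}\log\Big(\frac{N}{m} + 1\Big)\Big)^{1/5}\|M\|_{HS}\|a\|_{\ell^2} \tag{5}$$

for all $a \in \mathbb{R}^{N}$. The final inequality follows from Cauchy-Schwarz.

Now consider $A(m, \epsilon) \subseteq O(N)$, defined to be the set of all orthogonal matrices that differ from an element of $A(m)$ by a matrix with Hilbert-Schmidt norm at most $\epsilon$. Using (5), we have that for

$$O \in A\Big(m, \Big(\frac{m}{N\log(\frac{N}{m} + 1)}\Big)^{1/5}\Big)$$

we have $B(m, O) \le C$, where $C$ is a new absolute constant. On the other hand, denoting the complement of $A\big(m, \big(\frac{m}{N\log(\frac{N}{m} + 1)}\big)^{1/5}\big)$ by $A^{c}\big(m, \big(\frac{m}{N\log(\frac{N}{m} + 1)}\big)^{1/5}\big)$, by Lemma 14 we have

$$\mu\Big[O \in A^{c}\Big(m, \Big(\frac{m}{N\log(\frac{N}{m} + 1)}\Big)^{1/5}\Big)\Big] \ll e^{-cN^{2/5}}$$

for some positive constant $c$.

Now to conclude the proof of Proposition 7 it suffices to find a sufficiently high probability set of elements $O \in O(N)$ such that for every $1 \le m \le N$ we have $O \in A\big(m, \big(\frac{m}{N\log(\frac{N}{m} + 1)}\big)^{1/5}\big)$. However, for sufficiently large $N$, we see from the union bound that

$$\mu\Big[\bigcup_{1 \le m \le N} A^{c}\Big(m, \Big(\frac{m}{N\log(\frac{N}{m} + 1)}\Big)^{1/5}\Big)\Big] \ll Ne^{-cN^{2/5}} \ll e^{-c'N^{2/5}}.$$

This completes the proof of Proposition 7.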



4 Maximal Function Decomposition 



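Proposition 15. For $N$ fixed, let $\{\phi_n(x)\}_{n=1}^{N}$ be an ONS such that $\sum_{n=1}^{N} |\phi_n(x)|^2 \le N$. There exists $Q \subseteq O(N)$ with $\mathbb{P}[Q] \ge 1 - Ce^{-cN^{2/5}}$ such that for $O \in Q$ the associated system $\Phi(O) = \{\psi_n\}_{n=1}^{N}$ satisfies the following property. For any $f = \sum a_n\psi_n$, letting $m$ denote $\mathrm{support}(\{a_n\})$, we have that the maximal function defined by

$$\mathcal{M}f := \sup_{I \subseteq [N]} \Big|\sum_{n \in I} a_n\psi_n\Big|$$

(the supremum being over subintervals $I$) can be decomposed as $\mathcal{M}f := G + E$ where $\|G\|_{\mathcal{G}} \ll \|f\|_2$ and $\|E\|_2 \ll \big(\frac{m}{N}\big)^{c}\|f\|_2$ for some universal constant $c > 0$.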

To prove this, we fix $Q \subseteq O(N)$ from Proposition 7. We now decompose $[N]$ into a family of subintervals according to a concept of mass defined with respect to the $a_n$ values. We define the mass of a subinterval $I \subseteq [N]$ as $M(I) := \sum_{n \in I} |a_n|^2$. By normalization, we may assume that $M([N]) = 1$. We define $I_{0,1} := [N]$ and we iteratively define $I_{k,s}$, for $1 \le s \le 2^{k}$, as follows. Assuming we have already defined $I_{k-1,s}$ for all $1 \le s \le 2^{k-1}$, we will define $I_{k,2s-1}$ and $I_{k,2s}$, which are subintervals of $I_{k-1,s}$. $I_{k,2s-1}$ begins at the left endpoint of $I_{k-1,s}$ and extends to the right as far as possible while covering strictly less than half the mass of $I_{k-1,s}$, while $I_{k,2s}$ ends at the right endpoint of $I_{k-1,s}$ and extends to the left as far as possible while covering at most half the mass of $I_{k-1,s}$. More formally, we define $I_{k,2s-1}$ as the maximal subinterval of $I_{k-1,s}$ which contains the left endpoint of $I_{k-1,s}$ and satisfies $M(I_{k,2s-1}) < \frac{1}{2}M(I_{k-1,s})$. We also define $I_{k,2s}$ as the maximal subinterval of $I_{k-1,s}$ which contains the right endpoint of $I_{k-1,s}$ and satisfies $M(I_{k,2s}) \le \frac{1}{2}M(I_{k-1,s})$. We note that these subintervals are disjoint. We may express $I_{k-1,s} = I_{k,2s-1} \cup I_{k,2s} \cup i_{k,s}$, where $i_{k,s} \in I_{k-1,s}$. In other words, $i_{k,s}$ denotes the single element which lies between $I_{k,2s-1}$ and $I_{k,2s}$ (note that such a point always exists because we have required that $I_{k,2s-1}$ contains strictly less than half of the mass of the interval). Here it is acceptable, and in many instances necessary, for some choices of the intervals in this decomposition to be empty. By construction we have that

$$M(I_{k,s}) \le 2^{-k}. \tag{6}$$

We call an interval $J \subseteq [N]$ admissible if it is an element of the decomposition given above. We denote the collection of admissible intervals by $\mathcal{A}$. We additionally refer to the subset $\{I_{k,s} \mid 1 \le s \le 2^{k}\}$ of $\mathcal{A}$ as the admissible intervals on level $k$ and the subset $\{i_{k,s} \mid 1 \le s \le 2^{k}\}$ as the admissible points on level $k$. We note that every point in $[N]$ is an admissible point on some level. (Eventually, we have subdivided all intervals down to being single elements.)
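
The recursive construction above can be sketched in a few lines. This is an illustration under stated assumptions: intervals are represented as half-open index ranges `(lo, hi)`, the helper names `mass`, `split` and `decompose` are hypothetical, and an interval of zero mass is split by convention into an empty left part, its first index as the admissible point, and the remainder. Level $k$ stores all $2^k$ (possibly empty) intervals, so the sketch is meant for short coefficient lists only.

```python
def mass(a, lo, hi):
    """M(I) for the half-open index range I = [lo, hi): the sum of |a_n|^2."""
    return sum(x * x for x in a[lo:hi])


def split(a, lo, hi):
    """Split a nonempty range [lo, hi) into (left interval, point, right interval)."""
    total = mass(a, lo, hi)
    p, acc = lo, 0
    # extend the left interval while its mass stays strictly below half of the total
    while 2 * (acc + a[p] * a[p]) < total:
        acc += a[p] * a[p]
        p += 1
    # left plus the point p carries at least half the mass, so the rest carries at most half
    return (lo, p), p, (p + 1, hi)


def decompose(a):
    """Return (levels, points): levels[k] lists the 2^k intervals I_{k,s}, points[k] the split points."""
    levels, points = [[(0, len(a))]], []
    while any(lo < hi for lo, hi in levels[-1]):
        nxt, pts = [], []
        for lo, hi in levels[-1]:
            if lo < hi:
                left, p, right = split(a, lo, hi)
                nxt += [left, right]
                pts.append(p)
            else:  # an empty interval only has empty children and no point
                nxt += [(lo, lo), (lo, lo)]
                pts.append(None)
        levels.append(nxt)
        points.append(pts)
    return levels, points
```

On any input, each interval on level $k$ has mass at most $2^{-k}$ times the total mass, matching (6) after normalization, and every index appears exactly once among the admissible points.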

Now we write $\mathcal{I}_k := \{I_{k,s} : 1 \le s \le 2^{k}\}$. We decompose this as $\mathcal{I}_k^{s} := \{I \in \mathcal{I}_k : |I| \le 2^{-k/2}N\}$ and its complement, $\mathcal{I}_k^{\ell} := \{I \in \mathcal{I}_k : |I| > 2^{-k/2}N\}$. Here, $|I|$ denotes the number of nonzero $a_n$ values contained in an interval $I$.

For $J \subseteq [N]$, we define

$$S_J(x) = \sum_{n \in J} a_n\psi_n(x).$$

We also define

$$\tilde{S}_J(x) := \max_{I \subseteq J} \Big|\sum_{n \in I} a_n\psi_n(x)\Big|.$$

From Lemma 5 and Proposition 7 we deduce that $S_J = G_J + E_J$ where $\|G_J\|_{\mathcal{G}} \ll \|S_J\|_2$ and $\|E_J\|_2 \ll \big(\frac{|J|}{N}\big)^{c'}\|S_J\|_2$ for some positive constant $c'$. Our purpose now is to show a similar decomposition for $\tilde{S}_J(x)$. Clearly, it suffices to show such a decomposition for a pointwise majorant. Denote the decomposition of $S_{I_{k,s}}$ by $S_{I_{k,s}} := G_{k,s} + E_{k,s}$, and the decomposition of $S_{i_{k,s}}$ by $S_{i_{k,s}} := G_{i_{k,s}} + E_{i_{k,s}}$. Setting $r = 3$, for an interval $J$ we have the following bound, where the sums below are restricted to values of $k, s$ such that $I_{k,s}, i_{k,s} \subseteq J$:

$$\tilde{S}_J \ll \Big(\sum_k \Big(\sum_s |G_{k,s}|^{r}\Big)^{1/r} + \sum_k \Big(\sum_s |G_{i_{k,s}}|^{r}\Big)^{1/r}\Big) + \Big(\sum_k \Big(\sum_s |E_{k,s}|^{r}\Big)^{1/r} + \sum_k \Big(\sum_s |E_{i_{k,s}}|^{r}\Big)^{1/r}\Big) =: \tilde{G}_J + \tilde{E}_J. \tag{7}$$

This follows from the observation that for each point $x$, the maximizing subinterval $I \subseteq J$ can be decomposed as a union of admissible intervals and points with at most two intervals and points on each level. The contribution on each level can then be bounded by a constant times the contribution from the "worst" interval/point, which is in turn bounded by the quantity inside the sum over $k$ above for each level $k$.

For an admissible interval $J$, we let $k^*$ denote the level of $J$. We note that the sums over $k$ in (7) range only over $k \ge k^*$ (and the sums over $s$ are also appropriately restricted). Next we show that $\|\tilde{G}_J\|_{\mathcal{G}(c)} \ll \|S_J\|_2$ for some absolute constant $c$ and $\|\tilde{E}_J\|_2 \ll \big(\frac{|J|}{N}\big)^{c'}\|S_J\|_2$.

Now let us estimate $\|\tilde{E}_J\|_2$. We first estimate the contribution from the admissible points $i_{k,s} \in J$. We observe

$$\Big\|\sum_k \Big(\sum_s |E_{i_{k,s}}|^{r}\Big)^{1/r}\Big\|_2 \le \sum_k \Big\|\Big(\sum_s |E_{i_{k,s}}|^{r}\Big)^{1/r}\Big\|_2.$$

Since $r > 2$, this is

$$\le \sum_k \Big(\sum_s \|E_{i_{k,s}}\|_2^2\Big)^{1/2} \ll N^{-c'}\sum_k \Big(\sum_s \|S_{i_{k,s}}\|_2^2\Big)^{1/2},$$

where the latter inequality follows from the definition of $E_{i_{k,s}}$.

Now since these sums only range over values of $k, s$ such that $i_{k,s} \in J$, we may split the sum over $k$ into two portions as:

$$\sum_k \Big(\sum_s \|S_{i_{k,s}}\|_2^2\Big)^{1/2} = \sum_{k = k^*}^{k^* + 10\log(N)} \Big(\sum_s \|S_{i_{k,s}}\|_2^2\Big)^{1/2} + \sum_{k > k^* + 10\log(N)} \Big(\sum_s \|S_{i_{k,s}}\|_2^2\Big)^{1/2}. \tag{8}$$

To bound the first quantity in (8), it suffices to observe that the inner quantity for each $k$ is at most $\|S_J\|_2$, and hence its contribution is $\ll \log(N)\|S_J\|_2 \ll N^{\epsilon}\|S_J\|_2$, for a constant $\epsilon < c'$. (Thus we will adjust the value of $c'$ for our final estimate by subtracting $\epsilon$.)

To bound the second quantity in (8), we note that for any $i_{k,s} \in J$ with $k > k^* + 10\log(N)$, we have $\|S_{i_{k,s}}\|_2^2 \le N^{-10}\|S_J\|_2^2$. There are at most $N$ points $i_{k,s}$ in the sum, and thus

$$\sum_{k > k^* + 10\log(N)} \Big(\sum_s \|S_{i_{k,s}}\|_2^2\Big)^{1/2} \ll N^{-4}\|S_J\|_2.$$

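To estimate the contribution from the admissible intervals, we proceed as follows. For each $k \ge k^*$, we define $\mathcal{I}_k^{s}(J)$ to be the set of admissible intervals $I$ on level $k$ contained in $J$ such that $|I| \le 2^{-(k-k^*)/2}|J|$ and we let $\mathcal{I}_k^{\ell}(J)$ denote the set of remaining admissible intervals on level $k$ contained in $J$. Note that $\mathcal{I}_k^{s}(J)$ and $\mathcal{I}_k^{\ell}(J)$ are disjoint, and their union is the set of all admissible intervals on level $k$ contained in $J$. It thus suffices to estimate

$$\tilde{E}^{s} + \tilde{E}^{\ell} := \sum_{k \ge k^*} \Big(\sum_{I_{k,s} \in \mathcal{I}_k^{s}(J)} |E_{k,s}|^{r}\Big)^{1/r} + \sum_{k \ge k^*} \Big(\sum_{I_{k,s} \in \mathcal{I}_k^{\ell}(J)} |E_{k,s}|^{r}\Big)^{1/r}.$$

Now $|\mathcal{I}_k^{\ell}(J)| \le 2^{(k-k^*)/2}$, and we also have

$$\|E_{k,s}\|_2 \ll \Big(\frac{|J|}{N}\Big)^{c'}2^{-(k-k^*)/2}\|S_J\|_2.$$

Since $r > 2$, we have:

$$\Big\|\sum_{k \ge k^*} \Big(\sum_{I_{k,s} \in \mathcal{I}_k^{\ell}(J)} |E_{k,s}|^{r}\Big)^{1/r}\Big\|_2 \le \sum_{k \ge k^*} \Big(\sum_{I_{k,s} \in \mathcal{I}_k^{\ell}(J)} \|E_{k,s}\|_2^2\Big)^{1/2} \ll \Big(\frac{|J|}{N}\Big)^{c'}\sum_{k \ge k^*} 2^{(k-k^*)/4}2^{-(k-k^*)/2}\|S_J\|_2 \ll \Big(\frac{|J|}{N}\Big)^{c'}\|S_J\|_2.$$

Next, we recall that $I \in \mathcal{I}_k^{s}(J)$ implies $|I| \le 2^{-(k-k^*)/2}|J|$. We have $\|S_{I_{k,s}}\|_2 \ll 2^{-(k-k^*)/2}\|S_J\|_2$, thus $\|E_{k,s}\|_2 \ll \big(\frac{|J|}{N}\big)^{c'}2^{-c'(k-k^*)/2}\|S_{I_{k,s}}\|_2 \ll \big(\frac{|J|}{N}\big)^{c'}2^{-(c'+1)(k-k^*)/2}\|S_J\|_2$.

We then have

$$\Big\|\sum_{k \ge k^*} \Big(\sum_{I_{k,s} \in \mathcal{I}_k^{s}(J)} |E_{k,s}|^{r}\Big)^{1/r}\Big\|_2 \le \sum_{k \ge k^*} \Big(\sum_{I_{k,s} \in \mathcal{I}_k^{s}(J)} \|E_{k,s}\|_2^2\Big)^{1/2} \ll \Big(\frac{|J|}{N}\Big)^{c'}\sum_{k \ge k^*} 2^{(k-k^*)/2}2^{-(c'+1)(k-k^*)/2}\|S_J\|_2 \ll \Big(\frac{|J|}{N}\Big)^{c'}\|S_J\|_2.$$

Here we have used the fact that there are at most $2^{k-k^*}$ values of $s$ such that $I_{k,s} \subseteq J$ for each $k \ge k^*$. We can apply this for $J = [N]$ in particular, recalling that $|J|$ denotes the number of nonzero $a_n$ values contained in $J$, which in this case is $m$. This completes the proof that $\|E\|_2 \ll \big(\frac{m}{N}\big)^{c'}\|f\|_2$ for some positive constant $c'$.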

To show that $\|G\|_{\mathcal{G}(c)} \ll \|f\|_2$ for some universal constant $c > 0$, we will use the following lemma. These implications and arguments are well-known; however, we include a proof for completeness.

Lemma 16. Let $A$ denote a fixed, positive constant. For positive constants $c, C$, we define the following sets of measurable functions:

$$S_1(c) := \{f : \mathbb{T} \to \mathbb{R} \text{ s.t. } \|f\|_p \le c\sqrt{p}A \ \ \forall p \ge 2\},$$

$$S_2(c, C) := \{f : \mathbb{T} \to \mathbb{R} \text{ s.t. } \mu(|f| > \lambda) \le Ce^{-c\lambda^2/A^2} \ \ \forall \lambda > 0\},$$

$$S_3(c) := \{f : \mathbb{T} \to \mathbb{R} \text{ s.t. } \|f\|_{\mathcal{G}(c)} \le A\},$$

where $\mu(|f| > \lambda)$ denotes the Lebesgue measure of the subset of $x \in \mathbb{T}$ such that $|f(x)| > \lambda$. Then for any $c > 0$, there exist positive constants $c', C', c''$ (depending only on $c$) such that $S_1(c) \subseteq S_2(c', C')$ and $S_1(c) \subseteq S_3(c'')$. Similarly, for any $c, C > 0$, there exist positive constants $c', c''$ (depending only on $c, C$) such that $S_2(c, C) \subseteq S_1(c')$ and $S_2(c, C) \subseteq S_3(c'')$. Finally, for any $c > 0$, there exist positive constants $c', C', c''$ (depending only on $c$) such that $S_3(c) \subseteq S_2(c', C')$ and $S_3(c) \subseteq S_1(c'')$.

Proof. Fixing $c, C$, we will determine $c'$ such that $S_2(c, C) \subseteq S_3(c')$ (for every $A$). We consider an $f \in S_2(c, C)$. We consider $c' := d_1d_2$ as a product of two variables $d_1, d_2$ whose values will be set later. We assume $d_1 \le 1$. We have:

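$$\int_{\mathbb{T}} e^{c'|f|^2/A^2} = \int_{\mathbb{T}} e^{d_1d_2|f|^2/A^2} \le 1 + d_1\int_{\mathbb{T}} e^{d_2|f|^2/A^2}, \tag{9}$$

using the inequality $e^{x/a} \le \frac{1}{a}e^{x} + 1$ for all $a \ge 1$ and non-negative $x$ (this can be seen by considering the Taylor expansion of $e^{x}$).
Now, we observe that

$$\int_{\mathbb{T}} e^{d_2|f|^2/A^2} \le \sum_{k \ge 0} \int_{\mathbb{T}} e^{d_2|f|^2/A^2}\cdot 1_{A^2k \le |f|^2 < A^2(k+1)} \le \sum_{k \ge 0} \mu(|f|^2 \ge A^2k)\,e^{d_2(k+1)},$$

where $1_{A^2k \le |f|^2 < A^2(k+1)}$ denotes the characteristic function of the set on which $|f|^2$ takes values between $A^2k$ and $A^2(k+1)$. Since $f \in S_2(c, C)$, we have $\mu(|f|^2 \ge A^2k) \le Ce^{-ck}$ for all $k \ge 0$. Thus, we conclude

$$\int_{\mathbb{T}} e^{d_2|f|^2/A^2} \le C\sum_{k \ge 0} e^{-ck + d_2(k+1)} = Ce^{d_2}\sum_{k \ge 0} e^{-(c - d_2)k} = \frac{Ce^{c}}{e^{c - d_2} - 1}$$

whenever $d_2 < c$. Setting $d_2 = c/2$, we obtain $\int_{\mathbb{T}} e^{d_2|f|^2/A^2} \le Ce^{c}/(e^{c/2} - 1)$. Letting $d_1 = \min\big\{1, \frac{e^{c/2} - 1}{Ce^{c}}\big\}$, we have

$$\int_{\mathbb{T}} e^{c'|f|^2/A^2} \le 2,$$

and hence $\int_{\mathbb{T}} e^{c'|f|^2/A^2} - 1 \le 1$ for $c' = d_1d_2$, showing that $f \in S_3(c')$. Note that $c' = d_1d_2$ depends only on $c$ and $C$.

Conversely, we observe that for every $c > 0$, $S_3(c) \subseteq S_2(c, 2)$. To see this, consider $f \in S_3(c)$. Then we have

$$\int_{\mathbb{T}} e^{c|f|^2/A^2} - 1 \le 1, \quad\text{that is,}\quad \int_{\mathbb{T}} e^{c|f|^2/A^2} \le 2.$$

Thus for any $\lambda > 0$,

$$\mu(|f| > \lambda)\,e^{c\lambda^2/A^2} \le \int_{\mathbb{T}} e^{c|f|^2/A^2} \le 2.$$

It follows that $f \in S_2(c, 2)$.

For any $c > 0$, we will now show there exist $c', C'$ such that $S_1(c) \subseteq S_2(c', C')$ (for every $A$). We consider an $f \in S_1(c)$. This means that $\|f\|_p^p \le c^{p}p^{p/2}A^{p}$ for all $p \ge 2$. Thus, for every $\lambda > 0$, $\mu(|f| > \lambda)\lambda^{p} \le (cA)^{p}p^{p/2}$, which implies

$$\mu(|f| > \lambda) \le \Big(\frac{cA\sqrt{p}}{\lambda}\Big)^{p}. \tag{10}$$

For a fixed $\lambda$, we may minimize this quantity over the choices of $p \ge 2$. In the case that $\frac{\lambda^2}{ec^2A^2} \ge 2$, we may set $p$ equal to this value, and the quantity in (10) then becomes:

$$\Big(\frac{cA}{\lambda}\cdot\frac{\lambda}{\sqrt{e}\,cA}\Big)^{\frac{\lambda^2}{ec^2A^2}} = e^{-\frac{\lambda^2}{2ec^2A^2}}.$$

Hence by setting $c' = \frac{1}{2ec^2}$ we achieve $\mu(|f| > \lambda) \le e^{-c'\lambda^2/A^2}$ in these cases.

Now, when $\frac{\lambda^2}{ec^2A^2} < 2$, we note that $e^{-c'\lambda^2/A^2} > e^{-c'(2ec^2)} = e^{-1}$. Thus, setting $C' = e$, we have $\mu(|f| > \lambda) \le 1 \le C'e^{-c'\lambda^2/A^2}$ in these cases. Hence, in all cases we have that

$$\mu(|f| > \lambda) \le C'e^{-c'\lambda^2/A^2},$$

so $f \in S_2(c', C')$.

Conversely, for any $c, C > 0$, we will show there exists $c'$ such that $S_2(c, C) \subseteq S_1(c')$ for every $A$. We consider an $f \in S_2(c, C)$. Then for every $\lambda > 0$, we have $\mu(|f| > \lambda) \le Ce^{-c\lambda^2/A^2}$. We fix $p \ge 2$. We observe:

$$\|f\|_p^p = p\int_0^{\infty} \lambda^{p-1}\mu(|f| > \lambda)\,d\lambda \ll p\int_0^{\infty} \lambda^{p-1}e^{-c\lambda^2/A^2}\,d\lambda.$$

Substituting $\lambda = t^{1/p}$, we see this equals

$$\int_0^{\infty} e^{-ct^{2/p}/A^2}\,dt. \tag{11}$$

We note the identity $\frac{p}{2}\Gamma\big(\frac{p}{2}\big) = \int_0^{\infty} e^{-s^{2/p}}\,ds$, where $\Gamma$ denotes the function $\Gamma(z) := \int_0^{\infty} y^{z-1}e^{-y}\,dy$. Setting $s = \big(\frac{c}{A^2}\big)^{p/2}t$, we see that the quantity in (11) is

$$c^{-p/2}A^{p}\int_0^{\infty} e^{-s^{2/p}}\,ds = c^{-p/2}A^{p}\Big(\frac{p}{2}\Big)\Gamma\Big(\frac{p}{2}\Big).$$

By Stirling's formula, $\Gamma\big(\frac{p}{2}\big) \ll p^{-1/2}\big(\frac{p}{2e}\big)^{p/2}$. Hence

$$\|f\|_p \ll A\sqrt{p},$$

with an implied constant depending only on $c$ and $C$, as required. □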



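Appealing to Lemma 16, we see that we may bound the quantity $\|\tilde{G}_J\|_{\mathcal{G}(c)}$ by considering the $L^p$ norm. We recall that

$$\tilde{G}_J = \sum_k \Big(\sum_s |G_{k,s}|^{r}\Big)^{1/r} + \sum_k \Big(\sum_s |G_{i_{k,s}}|^{r}\Big)^{1/r},$$

where the sums are restricted to values of $k, s$ such that $I_{k,s}, i_{k,s} \subseteq J$. We let $k^*$ again denote the level of $J$, so we are only summing over values $k \ge k^*$.
We have

$$\|\tilde{G}_J\|_p \le \sum_k \Big\|\Big(\sum_s |G_{k,s}|^{r}\Big)^{1/r}\Big\|_p + \sum_k \Big\|\Big(\sum_s |G_{i_{k,s}}|^{r}\Big)^{1/r}\Big\|_p$$

by the triangle inequality, and this is

$$= \sum_k \Big\|\sum_s |G_{k,s}|^{r}\Big\|_{p/r}^{1/r} + \sum_k \Big\|\sum_s |G_{i_{k,s}}|^{r}\Big\|_{p/r}^{1/r} \le \sum_k \Big(\sum_s \big\||G_{k,s}|^{r}\big\|_{p/r}\Big)^{1/r} + \sum_k \Big(\sum_s \big\||G_{i_{k,s}}|^{r}\big\|_{p/r}\Big)^{1/r}$$

$$= \sum_k \Big(\sum_s \|G_{k,s}\|_p^{r}\Big)^{1/r} + \sum_k \Big(\sum_s \|G_{i_{k,s}}\|_p^{r}\Big)^{1/r}$$

by another application of the triangle inequality.

Now, using that $\|G_{k,s}\|_p \ll \sqrt{p}\,\|S_{I_{k,s}}\|_2$ and $\|G_{i_{k,s}}\|_p \ll \sqrt{p}\,\|S_{i_{k,s}}\|_2$ by Lemma 16, and $\|S_{I_{k,s}}\|_2 \le \|S_J\|_2\,2^{-(k-k^*)/2}$ and $\|S_{i_{k,s}}\|_2 \le \|S_J\|_2\,2^{-(k-k^*)/2}$, we have

$$\|\tilde{G}_J\|_p \le \sum_{k \ge k^*} \Big(\sum_s \|G_{k,s}\|_p^{r}\Big)^{1/r} + \sum_{k \ge k^*} \Big(\sum_s \|G_{i_{k,s}}\|_p^{r}\Big)^{1/r} \ll \sqrt{p}\,\|S_J\|_2\sum_{k \ge k^*} \Big(\sum_s 2^{-r(k-k^*)/2}\Big)^{1/r}.$$

Since the sum over $s$ ranges over at most $2^{k-k^*}$ values (recall we only include values of $s$ such that $I_{k,s} \subseteq J$) and $r > 2$, this is

$$\ll \sqrt{p}\,\|S_J\|_2\sum_{k \ge k^*} 2^{(k-k^*)(r^{-1} - 2^{-1})} \ll \sqrt{p}\,\|S_J\|_2.$$

It thus follows from Lemma 16 that

$$\|\tilde{G}_J\|_{\mathcal{G}(c)} \ll \|S_J\|_2$$

for some positive constant $c$. Lastly, we have that $\|\tilde{G}_J\|_{\mathcal{G}} \ll \|\tilde{G}_J\|_{\mathcal{G}(c)}$ from the definition of the Orlicz norm.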

5 Proof of the Main result 

We are now ready to prove: 

Theorem 17. Let $\Phi := \{\phi_n(x)\}_{n=1}^{N}$ be an ONS such that $\sum_{n=1}^{N} |\phi_n(x)|^2 \le N$. Then there exists $Q \subseteq O(N)$ with $\mathbb{P}[Q] \ge 1 - Ce^{-cN^{2/5}}$ such that for $O \in Q$ the alternate ONS $\Phi(O)$ satisfies

$$\|\mathcal{V}^2 f\|_2 \ll \sqrt{\log\log(N)}\,\|f\|_2.$$

Here we use the mass decomposition (into dyadic subintervals $I_{k,s}$) stated previously. We use the following easily verified fact (see [8], Lemma 29):

Lemma 18. For every $J \subseteq [N]$ ($J \ne \emptyset$) there exist $J_{\ell}, J_r \in \mathcal{A}$ and $i_J \in [N]$ such that $\tilde{J} := J_{\ell} \cup i_J \cup J_r$ is an interval (i.e. $J_{\ell}, i_J, J_r$ are adjacent), $J \subseteq \tilde{J}$, and $M(\tilde{J}) \le 2M(J)$.

Without loss of generality, we set $\|f\|_2 = 1$, and we have the pointwise inequality

$$|\mathcal{V}^2 f(x)|^2 \ll \sum_{k,s} \big|\tilde{S}_{I_{k,s}}1_{B(I_{k,s})}\big|^2 + \sum_{k,s} |S_{i_{k,s}}|^2 + \log\log(N),$$

where $B(I_{k,s}) \subseteq \mathbb{T}$ is the set such that $|\tilde{S}_{I_{k,s}}(x)|^2 \ge C\log\log(N)M(I_{k,s})$, for a fixed constant $C$ whose value will be chosen to be sufficiently large. Appealing to Proposition 15, for each $I_{k,s}$ we can decompose $\tilde{S}_{I_{k,s}} = G_{I_{k,s}} + E_{I_{k,s}}$. We then define $B_G(I_{k,s}) \subseteq \mathbb{T}$ by $|G_{I_{k,s}}|^2 \ge \frac{C}{10}\log\log(N)M(I_{k,s})$ and $B_E(I_{k,s}) \subseteq \mathbb{T}$ by $|E_{I_{k,s}}|^2 \ge \frac{C}{10}\log\log(N)M(I_{k,s})$.
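Clearly $\int \sum_{k,s} |S_{i_{k,s}}|^2 \le 1$ is acceptable, so it suffices to show that

$$\int \sum_{k,s} \big|\tilde{S}_{I_{k,s}}1_{B(I_{k,s})}\big|^2 \ll 1.$$

Now appealing to the decomposition above, we have

$$\int \sum_{k,s} \big|\tilde{S}_{I_{k,s}}1_{B(I_{k,s})}\big|^2 \ll \int \sum_{k,s} \big|G_{I_{k,s}}1_{B_G(I_{k,s})}\big|^2 + \int \sum_{k,s} \big|E_{I_{k,s}}1_{B_E(I_{k,s})}\big|^2.$$

First we estimate

$$\int \sum_{k,s} \big|E_{I_{k,s}}1_{B_E(I_{k,s})}\big|^2 \le \int \sum_{k,s} |E_{I_{k,s}}|^2.$$

Employing notation previously used above, we let $\mathcal{I}_k^{s} := \{I_{k,s} \text{ s.t. } |I_{k,s}| \le 2^{-k/2}N\}$ and $\mathcal{I}_k^{\ell} := \{I_{k,s} \text{ s.t. } |I_{k,s}| > 2^{-k/2}N\}$. Thus $I \in \mathcal{I}_k^{s}$ implies $|I| \le 2^{-k/2}N$, and $|\mathcal{I}_k^{\ell}| \le 2^{k/2}$. We then have

$$\int \sum_{k,s} |E_{I_{k,s}}|^2 = \int \sum_k \sum_{I_{k,s} \in \mathcal{I}_k^{s}} |E_{I_{k,s}}|^2 + \int \sum_k \sum_{I_{k,s} \in \mathcal{I}_k^{\ell}} |E_{I_{k,s}}|^2.$$

Using that $I \in \mathcal{I}_k^{s}$ implies $|I| \le 2^{-k/2}N$, we have $\int |E_{I_{k,s}}|^2 \ll 2^{-c'k/2}\|S_{I_{k,s}}\|_2^2 \le 2^{-k - c'k/2}$. Thus

$$\int \sum_k \sum_{I_{k,s} \in \mathcal{I}_k^{s}} |E_{I_{k,s}}|^2 \ll \sum_k 2^{-c'k/2} \ll 1.$$

Next, using that $|\mathcal{I}_k^{\ell}| \le 2^{k/2}$ and $\int |E_{I_{k,s}}|^2 \ll 2^{-k}$, we have

$$\int \sum_k \sum_{I_{k,s} \in \mathcal{I}_k^{\ell}} |E_{I_{k,s}}|^2 \ll \sum_k 2^{-k/2} \ll 1.$$

Finally, we estimate

$$\int \sum_{k,s} \big|G_{I_{k,s}}1_{B_G(I_{k,s})}\big|^2.$$

We can choose $C$ sufficiently large so that $|B_G(I_{k,s})| \ll \frac{1}{\log^{10}(N)}$ for all $k, s$ (here, $|B_G(I_{k,s})|$ denotes the Lebesgue measure). To see this, recall that $\|G_{I_{k,s}}\|_{\mathcal{G}(c)} \ll \sqrt{M(I_{k,s})}$. By Lemma 16 there exists a constant $c' > 0$ such that

$$\mu(|G_{I_{k,s}}| > \lambda) \ll e^{-c'\lambda^2/M(I_{k,s})}$$

for all $\lambda > 0$. Setting $\lambda^2 = \frac{C}{10}\log\log(N)M(I_{k,s})$, we obtain

$$|B_G(I_{k,s})| \ll \log(N)^{-c'C/10}.$$

We can then choose $C$ sufficiently large with respect to $c'$ to make this estimate $\ll \frac{1}{\log^{10}(N)}$.
Now we split the sum at $k = 100\log(N)$, so

$$\int \sum_{k,s} \big|G_{I_{k,s}}1_{B_G(I_{k,s})}\big|^2 = \int \sum_{k > 100\log(N)} \sum_s \big|G_{I_{k,s}}1_{B_G(I_{k,s})}\big|^2 + \int \sum_{k \le 100\log(N)} \sum_s \big|G_{I_{k,s}}1_{B_G(I_{k,s})}\big|^2.$$

By the Cauchy-Schwarz inequality,

$$\int \sum_{k \le 100\log(N)} \sum_s \big|G_{I_{k,s}}1_{B_G(I_{k,s})}\big|^2 \le \sum_{k \le 100\log(N)} \sum_s \|G_{I_{k,s}}\|_4^2\,\|1_{B_G(I_{k,s})}\|_4^2.$$

Now, by Lemma 16 we have $\|G_{I_{k,s}}\|_4^2 \ll \|S_{I_{k,s}}\|_2^2 \le 2^{-k}$ and, by the previous estimate, $\|1_{B_G(I_{k,s})}\|_4^2 \ll \frac{1}{\log^{5}(N)}$. Thus we have shown that the quantity above is

$$\ll \sum_{k \le 100\log(N)} \frac{1}{\log^{5}(N)} \ll 1.$$

Lastly, let $T \subseteq [N]$ denote the set of indices appearing in some $I_{k,s}$ for $k > 100\log(N)$. Note that any index will appear in at most $N$ such intervals, and that $M(I_{k,s}) \le N^{-100}$ if $k > 100\log(N)$. Thus $|a_n|^2 \le N^{-100}$ for $n \in T$. Thus we have

$$\int \sum_{k > 100\log(N)} \sum_s \big|G_{I_{k,s}}1_{B_G(I_{k,s})}\big|^2 \ll N^2\int \sum_{n \in T} |a_n\psi_n(x)|^2 \ll N^{-98}\int \sum_{n \in T} |\psi_n(x)|^2 \ll 1.$$

This completes the proof.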